Listen and Translate: A Proof of Concept for End-to-End Speech-to-Text Translation

Authors

  • Alexandre Berard
  • Olivier Pietquin
  • Christophe Servan
  • Laurent Besacier
Abstract

Current speech translation systems integrate (loosely or closely) two main modules: source language speech recognition (ASR) and source-to-target text translation (MT). In these approaches, a source language text transcript (as a sequence or as a graph) appears to be mandatory for producing a text hypothesis in the target language. Meanwhile, deep neural networks have yielded breakthroughs in several areas, including machine translation and speech recognition. Current systems have an “encoder-decoder” architecture in which a sequence of input symbols (or an input signal) is projected into a continuous low-dimensional space and the output sequence is generated from this representation. This type of architecture has been proposed for machine translation (Sutskever et al., 2014; Bahdanau et al., 2015) and for automatic speech recognition (Chorowski et al., 2015; Chan et al., 2016; Zweig et al., 2016). If we exclude some pioneering attempts to translate directly from a source phone sequence to a target word sequence (Besacier et al., 2006) or to discover a bilingual lexicon from parallel phone-word sequences (Godard et al., 2016), the only attempt to translate a source speech signal directly into target language text is that of Duong et al. (2016). However, the authors focus on the alignment between source speech utterances and their text translations without proposing a complete end-to-end translation system. The same authors (Anastasopoulos et al., 2016) also propose to jointly use IBM translation models (IBM Model 2) and dynamic time warping (DTW) to align source speech and target text, but again, only the alignment performance is measured in their work.
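
For readers unfamiliar with the dynamic time warping (DTW) mentioned above, the following is a minimal sketch of the classical algorithm on two one-dimensional sequences. It is a generic textbook formulation, not the joint IBM Model 2 and DTW procedure of Anastasopoulos et al. (2016); the function name `dtw_distance` and the absolute-difference local cost are assumptions chosen for illustration.

```python
def dtw_distance(a, b):
    """Classical DTW cost between two 1-D numeric sequences.

    Illustrative only: real speech alignment would compare feature
    vectors (e.g. per-frame acoustic features), not scalars.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local distance
            # Extend the cheapest of: skip in a, skip in b, or match both.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]
```

Because DTW lets one element align with several elements of the other sequence, a sequence and a time-stretched copy of it (for example `[1, 2, 3]` and `[1, 2, 2, 3]`) align at zero cost.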

Similar resources

Systemic Functional Linguistics as a Tool of Text Analysis for Translation

Translation, ipso facto, is an understanding and a transferal of meaning from one language into another. Therefore, it may be fitting to conclude that a suitable semantic theory should underpin any attempt to that end. This paper advocates implementing Systemic Functional Linguistics (henceforth SFL) which subscribes to a view of language as a "meaning-potential". In fact, Halliday and Matthies...

On the Translation Quality of Google Translate: With a Concentration on Adjectives

Translation, whose first traces date back at least to 3000 BC (Newmark, 1988), has always been considered time-consuming and labor-consuming. In view of this, experts have made numerous efforts to develop some mechanical systems which can reduce part of this time and labor. The advancement of computers in the second half of the twentieth century paved the ground for the invention of machine tra...

An Investigation on the Relationship between the Grammatical Competence of Young Iranian English Translation Students and their Ability to Translate from English to Farsi

Today, everything has changed and this has brought a need for learning a second language. Most countries across the world use English as their second/foreign language and the fundamental part of this process is grammar, i.e., the combination of sound, structure, and meaning system of language. A sentence can be composed of several words, clauses, as well as grammatical rules. These grammat...

Sequence-to-Sequence Models Can Directly Translate Foreign Speech

We present a recurrent encoder-decoder deep neural network architecture that directly translates speech in one language into text in another. The model does not explicitly transcribe the speech into text in the source language, nor does it require supervision from the ground truth source language transcription during training. We apply a slightly modified sequence-to-sequence with attention arc...

Study of Vinay and Darbelnet’s Seven Translation Strategies in Four Translations of Divorce Surah of Quran

This research study aimed to show what strategies the translators used in their translations of Divorce Surah of the Holy Quran. The model adopted by the researcher is based on Vinay and Darbelnet’s (1958) and Munday’s (2008) concept of cohesion. To this end, two Persian translations by Elahi Ghomshei (2015) and Foladvand (2014) and two English translations by A. J. Arberry (2007) and Yusuf Ali (19...

Journal:
  • CoRR

Volume abs/1612.01744

Publication date: 2016